{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# LAB 03.01 - Model Generation" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "!wget --no-cache -O init.py -q https://raw.githubusercontent.com/fagonzalezo/ai4eng-unal/main/content/init.py\n", "import init; init.init(force_download=False); init.get_weblink()\n", "init.endpoint" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "from local.lib.rlxmoocapi import submit, session\n", "session.LoginSequence(endpoint=init.endpoint, course_id=init.course_id, lab_id=\"L03.01\", varname=\"student\");" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "from sklearn.datasets import make_moons\n", "from local.lib import mlutils\n", "from IPython.display import Image\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## A machine learning task\n", "\n", "We have two species of bugs (**X bugs** and **Z bugs**). For each bug we have measured its **width** and **length**. Once we have a bug, determining whether it is of **species X** or **species Z** is very costly (lab analysis, etc.).\n", "\n", "**Machine learning goal**: We want to create a model that, when given the width and length of a bug, tells us whether it belongs to **species X** or **species Z**. If the model performs well, we might use it instead of the lab analysis.\n", "\n", "**To train a machine learning model**, we built a **training dataset** in which we **annotated** 20 bugs with their **confirmed** species. 
The training dataset has:\n", "\n", "- 20 data items\n", "- two data columns (**width** and **length**)\n", "- one label column, with two unique values: **0 for species X**, and **1 for species Z**.\n" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(20, 3) (20, 2) (20,)\n", "[[0.5 0.65]\n", " [0.75 0.34]\n", " [0.37 0.5 ]\n", " [0.57 0.74]\n", " [1. 0.69]]\n", "[0. 1. 1. 0. 1.]\n" ] }, { "data": { "text/html": [ "
\n",
" | width | height | y |\n",
"---|---|---|---|\n",
"0 | 0.50 | 0.65 | 0.0 |\n",
"1 | 0.75 | 0.34 | 1.0 |\n",
"2 | 0.37 | 0.50 | 1.0 |\n",
"3 | 0.57 | 0.74 | 0.0 |\n",
"4 | 1.00 | 0.69 | 1.0 |\n",
"